132 research outputs found

    Monte-Carlo Robot Path Planning

    Path planning is a crucial algorithmic approach for designing robot behaviors. Sampling-based approaches, like rapidly exploring random trees (RRTs) or probabilistic roadmaps, are prominent algorithmic solutions for path planning problems. Despite its exponential convergence rate, RRT can only find suboptimal paths. On the other hand, RRT*, a widely used extension to RRT, guarantees probabilistic completeness for finding optimal paths but suffers in practice from slow convergence in complex environments. Furthermore, real-world robotic environments are often partially observable or have poorly described dynamics, rendering the application of RRT* in complex tasks suboptimal. This paper studies a novel algorithmic formulation of the popular Monte-Carlo tree search (MCTS) algorithm for robot path planning. Notably, we study Monte-Carlo Path Planning (MCPP) by analyzing and proving, on the one hand, its exponential convergence rate to the optimal path in fully observable Markov decision processes (MDPs), and on the other hand, its probabilistic completeness for finding feasible paths in partially observable MDPs (POMDPs) assuming limited distance observability (proof sketch). Our algorithmic contribution allows us to employ recently proposed variants of MCTS with different exploration strategies for robot path planning. Our experimental evaluations in simulated 2D and 3D environments with a 7-degree-of-freedom (DOF) manipulator, as well as in a real-world robot path planning task, demonstrate the superiority of MCPP in POMDP tasks. Comment: Accepted: RA-L & IROS 202

    Hybrid control trajectory optimization under uncertainty

    Trajectory optimization is a fundamental problem in robotics. While optimization of continuous control trajectories is well developed, many applications require both discrete and continuous, i.e., hybrid, controls. Finding an optimal sequence of hybrid controls is challenging due to the exponential explosion of discrete control combinations. Our method, based on Differential Dynamic Programming (DDP), circumvents this problem by incorporating discrete actions inside DDP: we first optimize continuous mixtures of discrete actions and subsequently force the mixtures into fully discrete actions. Moreover, we show how our approach can be extended to partially observable Markov decision processes (POMDPs) for trajectory planning under uncertainty. We validate the approach in a car driving problem where the robot has to switch discrete gears and in a box pushing application where the robot can switch the side of the box to push. The pose and the friction parameters of the pushed box are initially unknown and only indirectly observable.

    Projections for Approximate Policy Iteration Algorithms

    Approximate policy iteration is a class of reinforcement learning (RL) algorithms in which the policy is encoded using a function approximator, and it has been especially prominent in RL with continuous action spaces. In this class of RL algorithms, ensuring that the policy return increases during a policy update often requires constraining the change in action distribution. Several approximations exist in the literature to solve this constrained policy update problem. In this paper, we propose to improve over such solutions by introducing a set of projections that transform the constrained problem into an unconstrained one, which is then solved by standard gradient descent. Using these projections, we empirically demonstrate that our approach can improve the policy update solution and the control over exploration of existing approximate policy iteration algorithms.

    Polymer-coated bioactive glass S53P4 increases VEGF and TNF expression in an induced membrane model in vivo

    The two-stage induced-membrane technique for treatment of large bone defects has become popular among orthopedic surgeons. In the first operation, the bone defect is filled with poly(methyl methacrylate) (PMMA), which is intended to produce a membrane around the implant. In the second operation, PMMA is replaced with autograft or allograft bone. Bioactive glasses (BAGs) are bone substitutes with bone-stimulating and angiogenetic properties. The aim of our study was to evaluate the inductive vascular capacity of BAG-S53P4 and poly(lactide-co-glycolide) (PLGA)-coated BAG-S53P4 for potential use as bone substitutes in a single-stage induced-membrane technique. Sintered porous rods of BAG-S53P4, PLGA-coated BAG-S53P4 and PMMA were implanted in the femur of 36 rabbits for 2, 4 and 8 weeks. The expression of vascular endothelial growth factor (VEGF) and tumor necrosis factor alpha (TNF) in the induced membranes of implanted materials was analyzed with real-time quantitative polymerase chain reaction and compared with histology. Both uncoated BAG-S53P4 and PLGA-coated BAG-S53P4 increased expression of VEGF and TNF, resulting in higher amounts of capillary beds, compared with the lower expression of VEGF and fewer capillary beds observed for negative control and PMMA samples. A significantly higher expression of VEGF was observed for PLGA-coated BAG-S53P4 than for PMMA at 8 weeks (p < 0.036). VEGF and TNF expression in the induced membrane of BAG-S53P4 and PLGA-coated BAG-S53P4 is equal or superior to that of PMMA, the "gold standard" material used in the induced-membrane technique. Furthermore, the VEGF and TNF expression for PLGA-coated BAG-S53P4 increased during follow-up. Peer reviewed

    Compatible natural gradient policy search

    Trust-region methods have yielded state-of-the-art results in policy search. A common approach is to use KL-divergence to bound the region of trust, resulting in a natural gradient policy update. We show that the natural gradient and trust region optimization are equivalent if we use the natural parameterization of a standard exponential policy distribution in combination with compatible value function approximation. Moreover, we show that standard natural gradient updates may reduce the entropy of the policy according to a wrong schedule, leading to premature convergence. To control entropy reduction, we introduce a new policy search method called compatible policy search (COPOS), which bounds entropy loss. The experimental results show that COPOS yields state-of-the-art results in challenging continuous control tasks and in discrete partially observable tasks.

    S53P4 bioactive glass scaffolds induce BMP expression and integrative bone formation in a critical-sized diaphysis defect treated with a single-staged induced membrane technique

    Surgical management of critical-sized diaphyseal defects involves multiple challenges, and up to 10% result in delayed union or non-union. The two-staged induced membrane technique is successfully used to treat these defects, but it is limited by the need for several procedures and bone graft. Repeated procedures increase costs and morbidity, while grafts are subject to donor-site complications and scarce availability. To transform this two-staged technique into one graft-independent procedure, we developed amorphous porous scaffolds sintered from the clinically used bioactive glass S53P4. This work constitutes the first evaluation of such scaffolds in vivo in a critical-sized diaphyseal defect in the weight-bearing rabbit femur. We provide important knowledge and prospects for future development of sintered S53P4 scaffolds as a bone substitute. Critical-sized diaphysis defects are complicated by inherent sub-optimal healing conditions. The two-staged induced membrane technique has been used to treat these challenging defects since the 1980s. It involves temporary implantation of a membrane-inducing spacer and subsequent bone graft defect filling. A single-staged, graft-independent technique would reduce both socio-economic costs and patient morbidity. Our aim was to enable such a single-staged approach through development of a strong bioactive glass scaffold that could replace both the spacer and the graft filling. We constructed amorphous porous scaffolds of the clinically used bioactive glass S53P4 and evaluated them in vivo using a critical-sized defect model in the weight-bearing femur diaphysis of New Zealand White rabbits. S53P4 scaffolds and standard polymethylmethacrylate spacers were implanted for 2, 4, and 8 weeks. Induced membranes were confirmed histologically, and their osteostimulative activity was evaluated through RT-qPCR of bone morphogenic proteins 2, 4, and 7 (BMPs). Bone formation and osseointegration were examined using histology, scanning electron microscopy, energy-dispersive X-ray analysis, and micro-computed tomography imaging. Scaffold integration, defect union and osteosynthesis were assessed manually and with X-ray projections. We demonstrated that S53P4 scaffolds induce osteostimulative membranes and produce osseointegrative new bone formation throughout the scaffolds. We also demonstrated successful stable scaffold integration with early defect union at 8 weeks postoperatively in critical-sized segmental diaphyseal defects with implanted sintered amorphous S53P4 scaffolds. This study presents important considerations for future research and the potential of the S53P4 bioactive glass as a bone substitute in large diaphyseal defects. (c) 2021 The Author(s). Published by Elsevier Ltd on behalf of Acta Materialia Inc. This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ) Peer reviewed

    An Algorithmic Perspective on Imitation Learning

    As robots and other intelligent agents move from simple environments and problems to more complex, unstructured settings, manually programming their behavior has become increasingly challenging and expensive. Often, it is easier for a teacher to demonstrate a desired behavior than to attempt to manually engineer it. This process of learning from demonstrations, and the study of algorithms to do so, is called imitation learning. This work provides an introduction to imitation learning. It covers the underlying assumptions, approaches, and how they relate; the rich set of algorithms developed to tackle the problem; and advice on effective tools and implementation. We intend this paper to serve two audiences. First, we want to familiarize machine learning experts with the challenges of imitation learning, particularly those arising in robotics, and the interesting theoretical and practical distinctions between it and more familiar frameworks like statistical supervised learning theory and reinforcement learning. Second, we want to give roboticists and experts in applied artificial intelligence a broader appreciation for the frameworks and tools available for imitation learning. We pay particular attention to the intimate connection between imitation learning approaches and those of structured prediction [Daumé III et al., 2009]. To structure this discussion, we categorize imitation learning techniques based on the following key criteria which drive algorithmic decisions: 1) The structure of the policy space. Is the learned policy a time-indexed trajectory (trajectory learning), a mapping from observations to actions (so-called behavioral cloning [Bain and Sammut, 1996]), or the result of a complex optimization or planning problem at each execution, as is common in inverse optimal control methods [Kalman, 1964, Moylan and Anderson, 1973]? 2) The information available during training and testing. In particular, is the learning algorithm privy to the full state that the teacher possesses? Is the learner able to interact with the teacher and gather corrections or more data? Does the learner have a (typically a priori) model of the system with which it interacts? Does the learner have access to the reward (cost) function that the teacher is attempting to optimize? 3) The notion of success. Different algorithmic approaches provide varying guarantees on the resulting learned behavior. These guarantees range from weaker (e.g., measuring disagreement with the agent's decision) to stronger (e.g., providing guarantees on the performance of the learner with respect to a true cost function, either known or unknown). We organize our work by paying particular attention to distinction (1): dividing imitation learning into directly replicating desired behavior (sometimes called behavioral cloning) and learning the hidden objectives of the desired behavior from demonstrations (called inverse optimal control or inverse reinforcement learning [Russell, 1998]). In the latter case, behavior arises as the result of an optimization problem solved for each new instance that the learner faces. In addition to method analysis, we discuss the design decisions a practitioner must make when selecting an imitation learning approach. Moreover, application examples, such as robots that play table tennis [Kober and Peters, 2009], programs that play the game of Go [Silver et al., 2016], and systems that understand natural language [Wen et al., 2015], illustrate the properties and motivations behind different forms of imitation learning. We conclude by presenting a set of open questions and point towards possible future research directions for machine learning.

    Femoral Stem Displacement in a Patient Suffering Recurrent Dislocations After Hip Hemiarthroplasty: Case Report

    Displacement of the femoral component during attempted closed reduction of a dislocated hip arthroplasty is an exceptionally rare, catastrophic event that makes operative management obligatory. We report the proximal migration of a femoral stem during attempted closed reduction in a patient with recurrent postoperative dislocations after hip hemiarthroplasty, and describe successful management by conversion to a standard total hip arthroplasty, retaining the same stem in the existing cement mantle. This illustrative case is reported not only as an extremely rare event, but also to highlight and discuss pitfalls and efficient measures in the management of this complex issue.